Europäisches
Patentamt






European 
Patent Office 



Office européen
des brevets



REC'D 28 JUN 2000






Bescheinigung Certificate 



Attestation 



Die angehefteten Unterlagen
stimmen mit der
ursprünglich eingereichten
Fassung der auf dem nächsten
Blatt bezeichneten
europäischen Patentanmeldung
überein.



The attached documents 
are exact copies of the 
European patent application 
described on the following 
page, as originally filed. 



Les documents fixés à
cette attestation sont
conformes à la version
initialement déposée de
la demande de brevet
européen spécifiée à la
page suivante.



Patentanmeldung Nr. Patent application No. Demande de brevet n° 

99306931.9 



PRIORITY DOCUMENT 

SUBMITTED OR TRANSMITTED IN 
COMPLIANCE WITH 
RULE 17.1(a) OR (b) 



Der Präsident des Europäischen Patentamts;
Im Auftrag 

For the President of the European Patent Office 

Le Président de l'Office européen des brevets
p.o. 




I.L.C. HATTEN-HECKMAN 



DEN HAAG, DEN

THE HAGUE, 23/05/00

LA HAYE, LE



EPA/EPO/OEB Form 1014 - 02.91 








Blatt 2 der Bescheinigung 
Sheet 2 of the certificate 
Page 2 de l'attestation



Anmeldung Nr.:
Application no.: 99306931.9
Demande n°:

Anmeldetag:
Date of filing: 31/08/99
Date de dépôt:



Anmelder: 

Applicant(s): 

Demandeur(s): 



LUCENT TECHNOLOGIES INC. 

Murray Hill, New Jersey 07974-0636

UNITED STATES OF AMERICA 



Bezeichnung der Erfindung: 
Title of the invention: 
Titre de l'invention:

Method and apparatus for macroblock DC and AC coefficient prediction for video coding 



In Anspruch genommene Priorität(en) / Priority(ies) claimed / Priorité(s) revendiquée(s)

Staat: Tag: Aktenzeichen:

State: Date: File no.

Pays: Date: Numéro de dépôt:



Internationale Patentklassifikation: 
International Patent classification: 
Classification Internationale des brevets: 



H04N7/26, H04N7/30 



Am Anmeldetag benannte Vertragstaaten:
Contracting states designated at date of filing: AT/BE/CH/CY/DE/DK/ES/FI/FR/GB/GR/IE/IT/LI/LU/MC/NL/PT/SE
États contractants désignés lors du dépôt:



Bemerkungen: 

Remarks: 

Remarques: 



The original title of the application reads as follows: 
Apparatus for compressing and expanding video data 



EPA/EPO/OEB Form 1012 - 04.98






APPARATUS FOR COMPRESSING AND EXPANDING VIDEO DATA

This invention relates to apparatus for compressing and expanding video data. 

Existing video compression standards are all based on the block discrete cosine
transform (DCT). The picture is divided into square blocks consisting of 8x8
pixels. The blocks may contain the actual pixels or the prediction residual, which is the
difference between the actual and motion compensated block pixels. Each block is
transformed into the DCT domain, which results in 8x8 coefficients.
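
As an illustration of the block transform described above, the following is a minimal Python sketch of an 8x8 two-dimensional DCT and its inverse. The orthonormal scaling and the helper names (`dct_matrix`, `matmul`, `transpose`, `dct2`, `idct2`) are assumptions made for the sketch; the text does not specify a normalisation.

```python
import math

def dct_matrix(size=8):
    """Orthonormal DCT-II basis matrix (assumed scaling): row k, column x."""
    basis = []
    for k in range(size):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / size)
        basis.append([scale * math.cos(math.pi * (2 * x + 1) * k / (2 * size))
                      for x in range(size)])
    return basis

def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def dct2(block):
    """Forward 2-D DCT: C * block * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))

def idct2(coeffs):
    """Inverse 2-D DCT: C^T * coeffs * C."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)

# Hypothetical test block: a smooth horizontal ramp of pixel values.
block = [[16 * col for col in range(8)] for _ in range(8)]
coeffs = dct2(block)
restored = idct2(coeffs)
```

With this scaling the first coefficient `coeffs[0][0]` is eight times the block average, and `restored` matches `block` up to floating-point rounding.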

The DCT process is used to remove the spatial redundancy between the pixels in 
the same block. However, it does not consider the redundancy between the pixels from 
different blocks. The first versions of the standards did not use any technique to exploit 
the correlation between different blocks. Recently, MPEG-4 and H.263+ have added 
tools/options to exploit this redundancy to a certain extent. At present, MPEG-4 predicts
the DC coefficient (first coefficient, which is actually the block average) of the current 
block by using the DC coefficients of the neighbouring blocks. H.263+ does this, and in 
addition, it also predicts the first row or column of the DCT coefficients in some cases if 
there is any benefit. 

In brief, existing compression algorithms exploit the fact that the DCT coefficients 
in the neighbouring blocks are sometimes similar to those in the current block. This 
means that if the blocks contain completely different coefficients, the prediction will not 
work. 

Against this background, there is provided apparatus for coding video data, 
comprising means for receiving pixel values organised in frames each comprising a 
matrix of video blocks, each video block comprising a video matrix of N pixel values, and 
processor means arranged to perform the following steps: 

a) to set each element in a prediction matrix to an initial prediction value; 



b) in the prediction matrix, to apply a smoothing transform to the values 
along the rows and then along the columns, or vice versa, to obtain interpolated values; 

c) to reset the prediction value to the interpolated value; 

d) to calculate the difference between the reset prediction values and
corresponding received pixel values to produce a residual prediction matrix containing
the prediction residuals; and

e) to perform a discrete cosine transform on the prediction residuals to 
obtain elements of a compressed video data matrix. 

The processor means is preferably arranged iteratively to calculate the reset
prediction value used to calculate the prediction residual by repeating steps b) and c).

The number of iterations may be predetermined or, in an alternative, the iterations 
may be repeated until the change in the prediction value between one iteration and the 
next, is less than a predetermined threshold. 

Step a) is most preferably performed by performing a discrete cosine transform on 
the video matrix to obtain a transform video matrix of N coefficients, selecting n of the
coefficients, setting the N-n remaining coefficients to zero to obtain an initial prediction 
transform matrix of initial prediction coefficients, and performing an inverse discrete 
cosine transform on the initial prediction transform matrix to obtain a matrix of N initial 
prediction values. 

In that case, the processor is preferably arranged to set n of the elements in the
compressed video data matrix equal to the n coefficients selected from the transform
video matrix, and to select the remaining N - n coefficients from the prediction residuals.

The processor is further preferably arranged to adjust the prediction residuals
before selecting the remaining N - n elements, by:

f) performing a discrete cosine transform on the reset prediction value
matrix to obtain a prediction transform matrix,

g) selecting n coefficients from the transform prediction matrix,

h) subtracting the selected n transform prediction matrix coefficients from the
selected n transform video coefficients to obtain n residual coefficients;

i) setting n elements of an adjustment transform matrix to the values of the
n residual coefficients and setting N - n remaining elements to zero;

j) performing an inverse discrete cosine transform on the adjustment
transform matrix to obtain an adjustment value matrix; and

k) subtracting the adjustment value matrix from the reset prediction value
matrix.

The apparatus may include means for processing pixels in a current and a 
previous frame to produce pixel values which are the prediction residual between the
actual pixel and a motion compensated pixel. 

The invention extends to apparatus for expanding video data compressed by 
apparatus as claimed in any preceding claim, comprising means for receiving the 
compressed video matrix, and processor means arranged to perform the following steps: 

a) to perform an inverse discrete cosine transform on received compressed
video data to obtain a prediction residual matrix;

b) to set each element in a prediction matrix to the initial prediction value; 

c) in the prediction matrix, to apply a smoothing transform to the values 
along the rows and then along the columns, or vice versa, to obtain interpolated values; 

d) to reset the prediction value to the interpolated value; and

e) to calculate the sum of the reset prediction values and the prediction 
residual in corresponding positions in the received coded block matrix to produce an 
expanded video data matrix. 

Embodiments of the invention will now be described, by way of example, with 
reference to the accompanying drawings in which:



Figures 1A and 1B, when assembled as shown in Figure 1, show a block diagram 
of a transmitter including apparatus for compressing video data embodying the 
invention; and 

Figures 2A and 2B, when assembled as shown in Figure 2, show a block diagram 
of a receiver including apparatus for expanding the video data compressed by the
apparatus of Figure 1. 

A frame of quantised and digitised pixel values is divided into video matrices
comprising blocks of N pixels where, as an example, N = 8 x 8. With switches 1a, 1b
set to "intra" as illustrated, a video matrix 2 is discrete cosine transformed in step 4 to
produce a video transform matrix 6 comprising a block of N discrete cosine transform
(DCT) coefficients where in the example N = 8 x 8. Of these, a square of n coefficients
is selected in step 8, essentially the DC coefficient and optionally other coefficients.

In step 10, the remaining N - n (i.e. 8 x 8 - n) coefficients are set to zero to obtain
an initial prediction transform matrix 12. The coefficients are inverse discrete cosine
transformed in step 14 to obtain an initial prediction matrix 16.
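
Steps 4 to 14 might be sketched in Python as follows. This is an illustration under stated assumptions, not the applicant's implementation: the DCT is taken to be orthonormal, the selected "square of n coefficients" is taken to be the top-left (lowest-frequency) `side` x `side` square so that n = side², and all function names are hypothetical.

```python
import math

def dct_matrix(size=8):
    """Orthonormal DCT-II basis matrix (assumed scaling)."""
    return [[math.sqrt((1.0 if k == 0 else 2.0) / size)
             * math.cos(math.pi * (2 * x + 1) * k / (2 * size))
             for x in range(size)] for k in range(size)]

def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def dct2(block):
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))

def idct2(coeffs):
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)

def initial_prediction(video, side):
    """Keep the top-left side x side square of DCT coefficients, zero the
    remaining N - n coefficients, and transform back to pixel values."""
    size = len(video)
    coeffs = dct2(video)                                  # step 4
    kept = [[coeffs[r][c] if r < side and c < side else 0.0
             for c in range(size)] for r in range(size)]  # steps 8 and 10
    return idct2(kept)                                    # step 14

# Hypothetical video matrix: a diagonal ramp of pixel values.
video = [[7 * r + 13 * c for c in range(8)] for r in range(8)]
pred_dc_only = initial_prediction(video, 1)   # n = 1: DC coefficient only
pred_2x2 = initial_prediction(video, 2)       # n = 4
```

With only the DC coefficient kept, every initial prediction value equals the block average; keeping more coefficients lets the initial prediction follow the coarse shape of the block.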

In step 18 interpolation is performed between the initial prediction values of matrix 
16 and the values in the neighbouring preceding blocks to reset the prediction matrix. 
Values in a row 20, spatially nearest to the video matrix 2, are used in the interpolation 
process. Linear interpolation is performed between the value in a row/column position in 
the initial prediction matrix and the value in a corresponding column in row 20 weighted
according to the distance in rows from the row 20. 

Similarly values in a column 22, spatially nearest the video matrix 2, are used in 
the interpolation process. Linear interpolation is also performed between the value in a 
row/column position in the initial prediction matrix and the value in a corresponding row 
in column 22, weighted by the distance in columns from the column 22.

$$V_{\text{interpolated}} = \left\{ 2V_{r,c} + \frac{V_{20,c}}{r} + \frac{V_{r,22}}{c} \right\} \times \frac{1}{4}$$

Where,

$V_{\text{interpolated}}$ is the interpolated prediction value, $V_{r,c}$ is the value at row r column c
of the initial prediction matrix 16, $V_{20,c}$ is the value in column c of row 20, r is the
distance in rows of the position r,c from row 20, $V_{r,22}$ is the value at row r in column 22,
and c is the distance in columns of the position r,c from the column 22.
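
One pass of the interpolation of step 18 might be sketched as follows. The sketch rests on several assumptions: that the formula reads V_interpolated = (2 V_rc + V_20,c / r + V_r,22 / c) / 4, that row 20 lies immediately above the block and column 22 immediately to its left (so the first row and column of the block are at distance 1), and that the function and argument names, which are hypothetical, are acceptable stand-ins.

```python
def interpolate_once(pred, row20, col22):
    """One interpolation pass over a square prediction matrix.

    pred[r][c] -- current prediction value at row r, column c (0-based)
    row20[c]   -- neighbouring row value in column c (assumed above the block)
    col22[r]   -- neighbouring column value in row r (assumed left of the block)
    """
    size = len(pred)
    out = []
    for r in range(size):
        dist_rows = r + 1          # distance in rows from row 20
        new_row = []
        for c in range(size):
            dist_cols = c + 1      # distance in columns from column 22
            value = (2.0 * pred[r][c]
                     + row20[c] / dist_rows
                     + col22[r] / dist_cols) / 4.0
            new_row.append(value)
        out.append(new_row)
    return out

# Hypothetical inputs: a flat prediction and flat neighbours.
pred = [[100.0] * 8 for _ in range(8)]
row20 = [100.0] * 8
col22 = [100.0] * 8
smoothed = interpolate_once(pred, row20, col22)
```

Under this reading the neighbouring values nearest the block edge carry the most weight, and their contribution falls off as 1/r and 1/c towards the far corner of the block.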

The interpolation step 18 is performed iteratively until, in one example, the change 
in values in one step is less than a predetermined threshold. In another example, a 
predetermined fixed number of iterations is performed. 
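
The two stopping rules can be sketched with a single loop. The interpolation step used here is an assumed form (weights of 2, 1/r and 1/c with an overall factor of 1/4, distances starting at 1), and the parameter names `max_iters` and `threshold` are hypothetical.

```python
def interpolate_once(pred, row20, col22):
    """Assumed form of the interpolation step; distances start at 1."""
    size = len(pred)
    return [[(2.0 * pred[r][c] + row20[c] / (r + 1) + col22[r] / (c + 1)) / 4.0
             for c in range(size)] for r in range(size)]

def refine(pred, row20, col22, max_iters=None, threshold=None):
    """Repeat the interpolation until either stopping rule is met.

    max_iters -- fixed number of passes
    threshold -- stop once the largest change in one pass is below this
    Returns the reset prediction matrix and the number of passes performed.
    """
    if max_iters is None and threshold is None:
        raise ValueError("give max_iters, threshold, or both")
    passes = 0
    while max_iters is None or passes < max_iters:
        updated = interpolate_once(pred, row20, col22)
        change = max(abs(updated[r][c] - pred[r][c])
                     for r in range(len(pred)) for c in range(len(pred)))
        pred = updated
        passes += 1
        if threshold is not None and change < threshold:
            break
    return pred, passes

# Hypothetical inputs: a flat prediction and flat neighbours.
start = [[100.0] * 8 for _ in range(8)]
edge_row = [100.0] * 8
edge_col = [100.0] * 8
fixed, fixed_passes = refine(start, edge_row, edge_col, max_iters=3)
settled, settled_passes = refine(start, edge_row, edge_col, threshold=0.01)
```

Because the coder and the expander must arrive at the same reset prediction values, whichever rule is chosen would need to be applied identically on both sides.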

When the iterations are complete, the reset prediction values are discrete cosine
transformed in step 24 to obtain 8 x 8 coefficients of a transform prediction matrix 26. In
step 28, n coefficients are selected and, in step 30, subtracted from the n video transform
coefficients previously selected in step 8 to produce n residual coefficients. In step 32,
the remaining 8 x 8 - n coefficients are set to zero to obtain 8 x 8 adjustment coefficients
34. These are inverse discrete cosine transformed to produce 8 x 8 adjustment values.

The values of the reset prediction matrix are adjusted by subtracting from them the
adjustment values. The values in the video matrix are subtracted from the adjusted
reset prediction values to obtain a prediction residual matrix 34 of 8 x 8 values. In step
36, the prediction residual values are discrete cosine transformed to produce a transform
residual matrix having 8 x 8 coefficients. Of these, n will be zero because of the
adjustment made to the reset prediction matrix.
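
The effect of the adjustment, namely that n coefficients of the transform residual matrix come out as zero, follows from the linearity of the DCT and can be checked numerically. The sketch below is an illustration only. It assumes an orthonormal DCT and a top-left `side` x `side` selection, and it fixes one sign convention as an assumption: the n residual coefficients are taken as video minus prediction, and the adjustment values are then added to the reset prediction so that its selected coefficients coincide with those of the video matrix. All names are hypothetical.

```python
import math

def dct_matrix(size=8):
    """Orthonormal DCT-II basis matrix (assumed scaling)."""
    return [[math.sqrt((1.0 if k == 0 else 2.0) / size)
             * math.cos(math.pi * (2 * x + 1) * k / (2 * size))
             for x in range(size)] for k in range(size)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def dct2(block):
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))

def idct2(coeffs):
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)

def transform_residual(video, reset_pred, side):
    """Steps 24 to 36 under the sign convention stated in the lead-in."""
    size = len(video)
    video_t = dct2(video)
    pred_t = dct2(reset_pred)                                    # step 24
    # Steps 28 to 32: n residual coefficients, remaining N - n set to zero.
    adjust_t = [[video_t[r][c] - pred_t[r][c] if r < side and c < side else 0.0
                 for c in range(size)] for r in range(size)]
    adjust = idct2(adjust_t)
    # Assumed sign: adding moves the selected coefficients onto the video's.
    adjusted = [[reset_pred[r][c] + adjust[r][c] for c in range(size)]
                for r in range(size)]
    residual = [[adjusted[r][c] - video[r][c] for c in range(size)]
                for r in range(size)]
    return dct2(residual)                                        # step 36

# Hypothetical data: a ramp as the video block, a different surface as prediction.
video = [[7.0 * r + 13.0 * c for c in range(8)] for r in range(8)]
reset_pred = [[100.0 + r * c for c in range(8)] for r in range(8)]
residual_t = transform_residual(video, reset_pred, side=2)
```

The four selected positions of `residual_t` are zero up to rounding, so only the remaining 8 x 8 - n coefficients carry residual information.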

The remaining 8 x 8 -n coefficients are selected in step 38 and assembled with the 
n video transform coefficients previously selected in step 8 to provide a compressed 
video matrix of 8 x 8 coefficients. These are channel coded in step 40 and transmitted 
through a medium 42. 
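
The assembly of step 38 amounts to filling one 8 x 8 matrix from two sources. A minimal sketch, assuming the n selected positions form the top-left `side` x `side` square and using placeholder coefficient values (the function name is hypothetical):

```python
def assemble(video_t, residual_t, side):
    """Take the n selected positions from the video transform matrix and the
    remaining N - n positions from the transform residual matrix."""
    size = len(video_t)
    return [[video_t[r][c] if r < side and c < side else residual_t[r][c]
             for c in range(size)] for r in range(size)]

video_t = [[1.0] * 8 for _ in range(8)]      # placeholder coefficient values
residual_t = [[2.0] * 8 for _ in range(8)]   # placeholder coefficient values
compressed = assemble(video_t, residual_t, side=2)
```

No extra coefficients are transmitted: the n positions that would be zero in the transform residual matrix are reused to carry the n video transform coefficients.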

In the apparatus shown in Figures 2A and 2B, the signal received from the medium
42 is channel decoded in step 44 to produce a decoded compressed video data matrix
46 of 8 x 8 coefficients. Of these, n are selected in step 48 and the remaining 8 x 8 - n
are set to zero in step 50 to obtain a decoded initial prediction transform matrix 52
having 8 x 8 coefficients. The coefficients are inverse discrete cosine transformed to
produce a decoded initial prediction matrix 54 having 8 x 8 initial prediction values.

In step 56, interpolation is performed iteratively on the initial prediction matrix in 
exactly the same manner as was performed in step 18 on the prediction matrix 16 using 
the (decoded) neighbouring row 20 and column 22 to obtain a matrix 58 of reset 
prediction values. 

In step 60, the remaining 8x8-n coefficients of matrix 46 are selected and n 
coefficients are set to zero in step 62 to obtain a decoded transform residual matrix 64 
having 8x8 coefficients. These coefficients are inverse discrete cosine transformed in 
step 66 to obtain a decoded prediction residual matrix having 8x8 residual values. In 
step 68 these are added to the reset prediction values in matrix 58 to produce a decoded 
video matrix 70 containing 8x8 pixel values corresponding to those of matrix 2. 
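
Steps 60 to 68 might be sketched as follows, taking the matrix 58 of reset prediction values as a given input. The orthonormal inverse DCT, the top-left `side` x `side` selection and the function names are assumptions of the sketch.

```python
import math

def dct_matrix(size=8):
    """Orthonormal DCT-II basis matrix (assumed scaling)."""
    return [[math.sqrt((1.0 if k == 0 else 2.0) / size)
             * math.cos(math.pi * (2 * x + 1) * k / (2 * size))
             for x in range(size)] for k in range(size)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def idct2(coeffs):
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)

def decode_block(received_t, reset_pred, side):
    """Recover pixel values from the received coefficients and matrix 58."""
    size = len(received_t)
    # Steps 60 and 62: keep the 8 x 8 - n residual coefficients, zero the n others.
    residual_t = [[0.0 if r < side and c < side else received_t[r][c]
                   for c in range(size)] for r in range(size)]
    residual = idct2(residual_t)                          # step 66
    return [[reset_pred[r][c] + residual[r][c]            # step 68
             for c in range(size)] for r in range(size)]

# Hypothetical case: only the DC position is non-zero, so no residual is carried.
received_t = [[0.0] * 8 for _ in range(8)]
received_t[0][0] = 800.0
reset_pred = [[100.0 + r + c for c in range(8)] for r in range(8)]
decoded = decode_block(received_t, reset_pred, side=1)
```

In this degenerate case the decoded block is simply the reset prediction, which shows that the n selected coefficients reach the output only through the prediction path and never through the residual path.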

Putting the switches 1a, 1b in their 'inter' position rearranges the apparatus to
operate not on the current frame video matrix, but on the residual produced by 
subtracting the values in a motion compensated block of a previous frame, from the 
values in the current frame video matrix 2 in step 74. The motion compensated values 
are added back in step 76 to produce the initial prediction matrix 16 values, and 
subtracted in step 78 from the reset prediction values. 
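
The subtraction of step 74 and an add-back of the kind performed in step 76 can be sketched as element-wise operations. The function names are hypothetical, and how these operations connect to the rest of the pipeline follows the figures, which are not reproduced here.

```python
def inter_residual(current, motion_compensated):
    """Step 74: current block minus the motion compensated block."""
    return [[cur - mc for cur, mc in zip(cur_row, mc_row)]
            for cur_row, mc_row in zip(current, motion_compensated)]

def add_back(values, motion_compensated):
    """Add the motion compensated values back to a matrix of values."""
    return [[val + mc for val, mc in zip(val_row, mc_row)]
            for val_row, mc_row in zip(values, motion_compensated)]

# Hypothetical integer pixel blocks.
current = [[10 * r + c for c in range(8)] for r in range(8)]
motion_compensated = [[10 * r + c - 1 for c in range(8)] for r in range(8)]
residual = inter_residual(current, motion_compensated)
rebuilt = add_back(residual, motion_compensated)
```

When the motion compensated block is a good match, the residual values are small, which is what makes the 'inter' setting worthwhile.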



In the expander shown in Figure 2, motion compensated values obtained in a 
decoded motion compensated video matrix 80 from a previously decoded frame, are 
added back in step 82 to produce the initial prediction matrix. 



CLAIMS 

1. Apparatus for coding video data, comprising means for receiving pixel 
values organised in frames each comprising a matrix of video blocks, each video block 
comprising a video matrix of N pixel values, and processor means arranged to perform 
the following steps: 

a) to set each element in a prediction matrix to an initial prediction value; 

b) in the prediction matrix, to apply a smoothing transform to the values 
along the rows and then along the columns, or vice versa, to obtain interpolated values; 

c) to reset the prediction value to the interpolated value; 

d) to calculate the difference between the reset prediction values and 
corresponding received pixel values to produce a residual prediction matrix containing 
the prediction residuals; and 

e) to perform a discrete cosine transform on the prediction residuals to 
obtain elements of a compressed video data matrix. 

2. Apparatus as claimed in claim 1, wherein the processor means is
arranged iteratively to calculate the reset prediction value used to calculate the 
prediction residual by repeating steps b) and c). 

3. Apparatus as claimed in claim 2, wherein the number of iterations is 
predetermined. 

4. Apparatus as claimed in claim 2, wherein the processor means is 
arranged to repeat the iterations until the change in the prediction value between one 
iteration and the next, is less than a predetermined threshold. 

5. Apparatus as claimed in any preceding claim, wherein step a) is 
performed by performing a discrete cosine transform on the video matrix to obtain a 
transform video matrix of N coefficients, selecting n of the coefficients, setting the N-n 
remaining coefficients to zero to obtain an initial prediction transform matrix of initial
prediction coefficients, and performing an inverse discrete cosine transform on the initial
prediction transform matrix to obtain a matrix of N initial prediction values.

6. Apparatus as claimed in claim 5, wherein the processor is arranged to set 
n of the elements in the compressed video data matrix equal to the n coefficients
selected from the transform video matrix, and to select the remaining N - n coefficients
from the prediction residuals.

7. Apparatus as claimed in claim 6, wherein the processor is arranged to 
adjust the prediction residuals before selecting the remaining N - n elements, by: 

f) performing a discrete cosine transform on the reset prediction value
matrix to obtain a prediction transform matrix,

g) selecting n coefficients from the transform prediction matrix,

h) subtracting the selected n transform prediction matrix coefficients from the
selected n transform video coefficients to obtain n residual coefficients;

i) setting n elements of an adjustment transform matrix to the values of the
n residual coefficients and setting N - n remaining elements to zero;

j) performing an inverse discrete cosine transform on the adjustment
transform matrix to obtain an adjustment value matrix; and

k) subtracting the adjustment value matrix from the reset prediction value 
matrix. 

8. Apparatus as claimed in any preceding claim, including means for
processing pixels in a current and a previous frame to produce pixel values which are
the prediction residual between the actual pixel and a motion compensated pixel.

9. Apparatus for expanding video data compressed by apparatus as claimed 
in any preceding claim, comprising means for receiving the compressed video matrix,
and processor means arranged to perform the following steps:

a) to perform an inverse discrete cosine transform on received compressed
video data to obtain a prediction residual matrix;

b) to set each element in a prediction block matrix to the initial prediction
value;

c) in the prediction matrix, to apply a smoothing transform to the values 
along the rows and then along the columns, or vice versa, to obtain interpolated values; 

d) to reset the prediction value to the interpolated value; and

e) to calculate the sum of the reset prediction values and the prediction 
residual in corresponding positions in the received coded block matrix to produce an 
expanded video data matrix. 

10. Apparatus as claimed in claim 8, wherein the processor means is
arranged iteratively to calculate the reset prediction value used to calculate the
prediction residual by repeating steps b) and c).

11. Apparatus as claimed in claim 10, wherein the number of iterations is
predetermined. 

12. Apparatus as claimed in claim 10, wherein the processor means is
arranged to repeat the iterations until the change in the prediction value between one
iteration and the next, is less than a predetermined threshold.

13. Apparatus as claimed in any of claims 9 to 12, wherein step a) is 
performed by performing a discrete cosine transform on the video matrix to obtain a 
transform video matrix of N coefficients, selecting n of the coefficients, setting the N - n
remaining coefficients to zero to obtain an initial prediction transform matrix of initial
prediction coefficients, and performing an inverse discrete cosine transform on the initial
prediction transform matrix to obtain a matrix of N initial prediction values.

14. Apparatus as claimed in claim 13 for expanding video data compressed 
by apparatus as claimed in claim 7, wherein the processor is arranged to select N - n
elements from the compressed video data matrix and to set n elements to zero before
performing the inverse discrete cosine transform to obtain the prediction residual matrix.

ABSTRACT

APPARATUS FOR COMPRESSING AND EXPANDING VIDEO DATA

Existing video data compression algorithms exploit the fact that the DCT 
coefficients in the neighbouring blocks are sometimes similar to those in the current 
block. This means that if the blocks contain completely different coefficients, the 
prediction will not work. 

Apparatus for coding video data is disclosed in which each element in a prediction
matrix is set to an initial prediction value. In the prediction matrix, a smoothing transform
is applied to the values along the rows and then along the columns, or vice versa, to
obtain interpolated values. The prediction value is reset to the interpolated value
and the difference between the reset prediction values and corresponding received pixel 
values is calculated to produce a residual prediction matrix containing the prediction 
residuals. A discrete cosine transform is performed on the prediction residuals to obtain 
elements of a compressed video data matrix. 

The processor means is preferably arranged iteratively to calculate the reset 
prediction value used to calculate the prediction residual by repeating steps b) and c). 



Figure 1a and 1b